Automated Article Links Identification for Web-based Online Medical Journals

Author

Daniel X. Le

Abstract

As part of research into Web-based document analysis, including Web page downloading and classification, an algorithm has been developed to automatically identify article links in Web-based online journals. The algorithm computes feature vectors from the attributes and contents of links extracted from HTML files and applies an instance-based learning algorithm using a nearest neighbor methodology to identify article links. Its performance was evaluated on a sample of several thousand HTML links from Web-based medical journals. The evaluation shows that the algorithm identifies article links with an accuracy greater than 99%.
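The pipeline described in the abstract (extract links from HTML, turn each link into a feature vector, label it by its nearest labeled neighbor) can be illustrated with a minimal Python sketch. The abstract does not list the actual features, distance measure, or training data, so everything specific here is an assumption made for illustration: the three features in `features`, the Euclidean distance, the single-neighbor rule, and the toy `TRAINING` examples are all hypothetical and are not the paper's method.

```python
from html.parser import HTMLParser
import math
import re


class LinkExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def features(href, text):
    """Hypothetical feature vector; the paper's real features are not given."""
    combined = (href + " " + text).lower()
    return (
        1.0 if re.search(r"\d", href) else 0.0,                     # digits in URL
        1.0 if re.search(r"abstract|full|pdf", combined) else 0.0,  # article keyword
        min(len(text.split()), 10) / 10.0,                          # anchor length, scaled
    )


def nearest_neighbor(query, training):
    """Return the label of the training vector closest to query (1-NN, Euclidean)."""
    return min(training, key=lambda example: math.dist(query, example[0]))[1]


# Toy labeled examples, invented for this sketch only.
TRAINING = [
    (features("/content/12/3/45.abstract", "Abstract"), "article"),
    (features("/content/12/3/45.full.pdf", "Full Text (PDF)"), "article"),
    (features("/about", "About the journal"), "other"),
    (features("/subscribe", "Subscribe"), "other"),
]


def classify_links(html, training):
    return [(href, nearest_neighbor(features(href, text), training))
            for href, text in extract_links(html)]
```

Calling `classify_links(page_html, TRAINING)` returns one `(href, label)` pair per link on the page. An instance-based learner of this kind keeps the labeled examples themselves rather than fitting a model, so it can be extended to a new journal layout simply by adding a few labeled links from that layout.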

Similar resources

Automated Medical Citation Records Creation for Web-Based On-Line Journals

With the rapid expansion and utilization of the Internet and Web technologies, there is an increasing number of on-line medical journals. On-line journals pose new challenges in the areas of automated document analysis and content extraction, database citation records creation, data mining, and other document related applications. New techniques are needed to capture, classify, analyze, extract...


Automated Document Labeling

An increasing number of publishers are using the Internet and the World Wide Web to provide their subscribers with access to online journals. New techniques are needed to capture, classify, analyze, extract, modify, and reformat Web-based document information for computer storage, access, and processing. An R&D division of the National Library of Medicine (NLM) is developing an automated system...


Image flip CAPTCHA

The massive and automated access to Web resources through robots has made it essential for Web service providers to make some conclusion about whether the "user" is a human or a robot. A Human Interaction Proof (HIP) like Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) offers a way to make such a distinction. CAPTCHA is a reverse Turing test used by Web serv...


How to find the best evidence.

The Internet has made finding evidence for clinical practice fairly easy. Many different types of databases that can be searched for relevant key terms are available for free or for subscription. Bibliographic or library databases contain books, book chapters, reports, citations, abstracts, and either the full text of the articles indexed or links to the full text. Citation databases are specia...


Automated Cleanup Processing for Extracting Bibliographic Data from Biomedical Online Journals

An R&D division of the National Library of Medicine (NLM) has developed the Web-based Medical Article Records System (WebMARS) to create citations from online biomedical journals. This paper presents one important part of this system, the automated cleanup module that extracts bibliographic information from HTML-formatted text based on a rule-based approach. A learning scheme comparing the outp...


Publication date: 2004
